Rank | Count | Beginning |
---|---|---|
68879 | 12368 | The |
25393 | 2770 | He |
34900 | 2372 | In |
30997 | 2287 | I |
38769 | 2169 | It |
31 | 1883 | A |
12164 | 1774 | But |
85864 | 1690 | This |
91762 | 1297 | We |
4313 | 1187 | And |
84188 | 1156 | They |
62181 | 1091 | She |
32084 | 985 | If |
30995 | 975 | “I |
7177 | 923 | As |
91764 | 810 | “We |
40871 | 774 | It’s |
68885 | 763 | “The |
79180 | 736 | There |
22027 | 724 | For |
54096 | 680 | On |
95436 | 661 | When |
67747 | 658 | That |
98782 | 639 | You |
96439 | 558 | While |
1670 | 546 | After |
8696 | 546 | At |
64221 | 503 | So |
30153 | 488 | However, |
429 | 477 | According |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
Especially for N=1, we only need a small corpus to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
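The same counting can be sketched outside the database. The following is a minimal Python illustration, not the code used to produce the table: the function name `count_beginnings` and the sample sentences are hypothetical, and the sentences are assumed to be available as plain strings. Like `substring_index(sentence, ' ', N)`, it keeps everything before the N-th space, so changing `n` gives the beginnings for N=2, 3, 4.

```python
from collections import Counter


def count_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n tokens."""
    counts = Counter(
        # Mirrors substring_index(sentence, ' ', n): split on single spaces
        # and keep the first n pieces (the whole sentence if it is shorter).
        " ".join(sentence.split(" ")[:n])
        for sentence in sentences
    )
    # most_common sorts by descending count, like ORDER BY cnt DESC LIMIT 50.
    return counts.most_common(limit)


if __name__ == "__main__":
    # Made-up sample sentences, for illustration only.
    sample = [
        "The cat sat on the mat.",
        "The dog barked.",
        "He left early.",
        "“I agree,” she said.",
    ]
    for beginning, count in count_beginnings(sample, n=1):
        print(count, beginning)
```

As in the query, tokens are split on spaces only, so punctuation stays attached to the word. This is why entries such as “I and However, appear in the table as separate beginnings.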
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV